# Learning to Noise: Application-Agnostic Data Sharing with Local Differential Privacy

To install the required libraries, run:
```bash
pip3 install -r requirements.txt
```
The MNIST dataset is downloaded automatically when an experiment starts. The Lending Club dataset must be downloaded 
manually from [Kaggle](https://www.kaggle.com/wordsforthewise/lending-club). Once downloaded, pre-process 
Lending Club using:
```sh
python preprocessing/lending_club.py
```
after changing the path of the `.gz` file accordingly.
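
Before pointing the script at the archive, it can help to confirm that the downloaded file is readable. The following is a minimal sketch using only the Python standard library. The helper name `peek_gzipped_csv` is hypothetical and not part of this repository, and it assumes the archive is a gzip-compressed, UTF-8 encoded CSV file.

```python
import csv
import gzip


def peek_gzipped_csv(path, num_rows=3):
    """Return the header and the first few rows of a gzip-compressed CSV file.

    Hypothetical helper for sanity-checking a download; not part of the repository.
    """
    # Text mode ("rt") decompresses and decodes on the fly, so the file is never
    # fully loaded into memory.
    with gzip.open(path, mode="rt", newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)
        # zip() stops after num_rows rows (or earlier if the file is shorter).
        rows = [row for _, row in zip(range(num_rows), reader)]
    return header, rows
```

Calling `peek_gzipped_csv("path/to/your/download.csv.gz")` with the actual location of the file returns the column names and a few sample rows.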

To train the VAE-based privatization mechanism, run, for example:

```sh
python runDPPreTraining.py --task MNIST --md 7.5 --posterior_std 1.01 --fraction 0.75
```
This runs the data collection experiment by default. To run the novel-class classification experiment instead, pass 
`--novel_class`; for the data join experiment, pass `--data_join_task`. To train a centrally differentially private 
(CDP) encoder or decoder, pass `--dp_encoder` or `--dp_decoder`, respectively.

To train a classifier on the learnt representations, run, for example:
```sh
python runClassifier.py --param_dir runs/[output-directory-from-VAE-training] --md 7.5 --use_label_noise --epsilon 2
```
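
The `--use_label_noise` and `--epsilon` flags indicate that labels are perturbed under a privacy budget. This README does not say which mechanism is used. One common choice for categorical labels is k-ary randomized response, sketched below as an illustration. The function names are hypothetical and are not taken from the repository.

```python
import math
import random


def keep_probability(num_classes, epsilon):
    """Probability of reporting the true label under k-ary randomized response."""
    return math.exp(epsilon) / (math.exp(epsilon) + num_classes - 1)


def randomized_response(label, num_classes, epsilon, rng=random):
    """Return a privatized label (illustrative sketch, not the repository's code).

    The true label is kept with probability e^eps / (e^eps + k - 1); otherwise
    one of the other k - 1 labels is drawn uniformly at random.
    """
    if rng.random() < keep_probability(num_classes, epsilon):
        return label
    # Draw an index over the k - 1 remaining labels, then skip the true label.
    other = rng.randrange(num_classes - 1)
    return other if other < label else other + 1
```

With 10 classes and `epsilon = 2`, as in the MNIST command above, `keep_probability(10, 2.0)` is roughly 0.45, so under this scheme a little under half of the reported labels would be the true ones.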

To train a model when noising features directly, run, for example:

```sh
python runClassifier.py --noise_features_directly --task MNIST --fraction 0.75 --use_label_noise --epsilon 2
```
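
For intuition about what noising features directly can involve, the sketch below applies the standard Laplace mechanism to a single feature vector. It is an assumption for illustration only: the README does not state which mechanism `--noise_features_directly` uses or how `--epsilon` is allocated, and the names `laplace_scale` and `noise_features` are hypothetical.

```python
import random


def laplace_scale(num_features, epsilon, lower=0.0, upper=1.0):
    """Noise scale for the Laplace mechanism on features clipped to [lower, upper].

    Two clipped vectors differ by at most num_features * (upper - lower) in L1
    norm, which is the sensitivity in the local setting.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return num_features * (upper - lower) / epsilon


def noise_features(features, epsilon, lower=0.0, upper=1.0, rng=random):
    """Clip each feature and add Laplace noise (illustrative sketch only)."""
    if not features:
        return []
    rate = 1.0 / laplace_scale(len(features), epsilon, lower, upper)
    noised = []
    for x in features:
        clipped = min(max(x, lower), upper)
        # The difference of two i.i.d. exponential samples is Laplace-distributed
        # with mean 0 and scale 1 / rate.
        noise = rng.expovariate(rate) - rng.expovariate(rate)
        noised.append(clipped + noise)
    return noised
```

Under these assumptions, a 784-pixel MNIST image with pixel values in [0, 1] and `epsilon = 2` would receive per-pixel noise of scale 392, far larger than the pixel range itself.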

For the full list of command-line options, run:
```sh
python runDPPreTraining.py --help
```
```sh
python runClassifier.py --help
```

Training curves are logged to TensorBoard. Run:

```sh
tensorboard --logdir runs
```

and open http://localhost:6006/ to visualize them.
